The best-known unmanned aerial vehicles (UAVs), Predator and Global Hawk,
are large, multi-million dollar aircraft managed as theater/national assets. With synthetic
aperture radar (SAR), electro-optic/infrared (EO/IR), and signals intelligence (SIGINT)
payloads, these UAVs have proven their worth in battlefields from Bosnia to Afghanistan
and Iraq. This success has led to a surge in proposed UAV missions and designs using a
layered approach with multiple classes of UAVs to provide persistent narrow and wide
ISR (intelligence, surveillance, reconnaissance) coverage. Programs such as the Future
Combat System (FCS) include a large role for tactical UAVs, small UAVs, and
unmanned ground vehicles (UGVs). The smaller, cheaper unmanned vehicles can be
deployed at the brigade or company level to “see over the next hill.” With many vehicles
and many sensors, network bandwidth becomes an issue, so future UAVs will include
aided/automatic target recognition (AiTR/ATR) capabilities to reduce both
communication bandwidth and latency.

Large UAVs such as Global Hawk and Predator have been successful using
today’s high-performance embedded computing (HPEC) solutions. Global Hawk currently
uses a 9U VME system with PowerPC processors for SAR and EO/IR processing, while the
Predator is a bit smaller, using a 6U VME system for TESAR processing. The challenge is
to provide similar processing power for much smaller UAVs, many of which have less than
1/2 the payload weight and 1/4 the volume of the Predator (see examples in Table 1). Note
that only a small portion of the payload is allocated for signal and image processing. For
example, the TESAR image processor on the Predator is just 55 pounds, less than 1/10 of
the total payload weight.
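
The weight argument above can be checked with a few lines of arithmetic. The sketch below
is illustrative only: it assumes the "Predator" in the text corresponds to the 800 lb
payload listed for Predator B in Table 1, and it assumes (hypothetically) that a smaller
UAV would get the same fraction of its payload for processing. The names PAYLOAD_LBS,
processing_fraction, and scaled_budgets are made up for this example.

```python
# Payload weights in pounds, as listed in Table 1.
PAYLOAD_LBS = {
    "Global Hawk": 1000, "Predator B": 800, "Heron A": 550, "Hunter": 250,
    "Eagle Eye": 200, "Fire Scout": 200, "Sentry": 75,
    "Dragon Warrior": 35, "Dragon Eye": 5,
}
TESAR_PROCESSOR_LBS = 55  # weight of the TESAR image processor, from the text


def processing_fraction(processor_lbs, payload_lbs):
    """Share of the payload weight taken by the signal/image processor."""
    return processor_lbs / payload_lbs


def scaled_budgets(fraction):
    """Processor weight budget per platform if the same share applied.

    Assumption for illustration only: the source does not say that smaller
    UAVs allocate the same share of payload to processing.
    """
    return {name: round(lbs * fraction, 2) for name, lbs in PAYLOAD_LBS.items()}


fraction = processing_fraction(TESAR_PROCESSOR_LBS, PAYLOAD_LBS["Predator B"])
budgets = scaled_budgets(fraction)
print(f"TESAR share of payload: {fraction:.1%}")
for name, lbs in budgets.items():
    print(f"{name:15s} {lbs:6.2f} lb")
```

Under these assumptions the processor share is just under 7 percent, which would leave
well under one pound of processing hardware on the smallest platform in the table.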
Table 1: UAV Payloads

UAV                   Global  Predator  Heron  Hunter  Eagle  Fire   Sentry  Dragon   Dragon
                      Hawk    B         A              Eye    Scout          Warrior  Eye
Length (ft)           44.4    36        26     22      17     23     8.4     10       3
Wingspan (ft)         116     66        54     29      17     20     12.8    9        3.8
Height (ft)           14      9.5       5.9    5.6     5.5    9.5    4       5        1
Payload weight (lbs)  1000    800       550    250     200    200    75      35       5
Max altitude (ft)     65k     50k       25k    15k     20k    20k    15k     4k       1.2k
Sensors: EO/IR        x       x         x      x       x      x      x       x        x
Sensors: SAR          x       x         x      x       x      x      -       -        -
Sensors: ISAR         x       x         x      x       x      x      -       -        -
Sensors: SIGINT       x       x         x      -       x      x      -       -        -
Sensors: MTS          x       x         x      x       x      x      -       -        -
Endurance (hrs)       36      36        36     10      5      4      3       3        1
Max airspeed (kts)    320     220       120    100     220    120    100     70       35


In the past, we have relied on Moore’s Law to help us out. We could wait a
couple of years, and the technology improvements in the electronics would enable a
significant reduction in size. However, we’ve come to a point where Moore’s Law effects
still increase absolute performance, but not performance per watt, per pound, or per
cubic foot. Although the number of transistors available is increasing, the power
consumption is increasing at almost the same rate (see Figure 1). The increased
infrastructure to handle the power distribution and heat extraction incurs a penalty in size
and weight. Alternative approaches are needed.

One approach is to leverage field-programmable gate arrays (FPGAs) as
programmable processors. For some front-end signal and image processing functions,
FPGAs have been shown to provide a 10-20 fold performance boost over a PowerPC G4
processor. However, some front-end tasks, like filter weight computation, and most
back-end processing still perform much better on a PowerPC processor. Given the higher
power consumption of an FPGA, there is a limit on the number of FPGAs that can be
used in a system. In trying to fit the most processing power into the smallest space for a
given application, the trick is not only to find the optimum balance between
FPGAs and PowerPCs, but also to choose exactly which model of each chip to use.


Figure 1: PowerPC frequency and power consumption. (Chart title: PowerPC Performance/Watt.)

Most  evaluations  of  FPGA  chips  focus  on  the  number  of  logic  cells,  slices,  and  processor 
blocks.  An  example  comparison  of  Xilinx  FPGAs  is  shown  in  Figure  2.  For  embedded 
signal  and  image  processing  applications,  more  critical  elements  tend  to  be  the  number  of 
multiplier  blocks  and  the  size  of  the  block  RAM.  This  leads  to  different  component 
selection,  as  shown  in  Figure  3. 

The slot limitations on space-constrained systems also favor integrating the
analog-to-digital conversion and general I/O with the processing. This is especially
important for multi-channel systems. The sensor I/O can be part of the base-board design
along with the processors or be a separate mezzanine card. A separate mezzanine card gains
board real estate but restricts the power and cooling available to each card.

This presentation will provide a detailed set of trade-offs in computational
capabilities, I/O capabilities, and memory capacities distributed between FPGAs and
PowerPCs for sample applications of SAR image formation and SIGINT channelized
receiver throughout.
Figure 2: Typical comparison of FPGA attributes.
Figure 3: Focusing on RAM and multiplier blocks for FPGA computing.
•  COMINT/ESM

•  Software Radio

•  Radar

•  ELINT/ESM/RWR

•  EO/IR Imagery

... and other HPEC challenges, such as ATR, to reduce sensor communication
bandwidth/latency needs

UAVs

Helicopters

Man-pack/Brief-
Platforms

UAV                   Global  Predator  Heron  Hunter  Eagle  Fire   Sentry  Dragon   Dragon
                      Hawk    B         A              Eye    Scout          Warrior  Eye
Length (ft)           44.4    36        26     22      17     23     8.4     10       3
Wingspan (ft)         116     66        54     29      17     20     12.8    9        3.8
Height (ft)           14      9.5       5.9    5.6     5.5    9.5    4       5        1
Payload weight (lbs)  1000    800       550    250     200    200    75      35       5
Max altitude (ft)     65k     50k       25k    15k     20k    20k    15k     4k       1.2k
Sensors: EO/IR        x       x         x      x       x      x      x       x        x
Sensors: SAR          x       x         x      x       x      x      -       -        -
Sensors: ISAR         x       x         x      x       x      x      -       -        -
Sensors: SIGINT       x       x         x      -       x      x      -       -        -
Sensors: MTS          x       x         x      x       x      x      -       -        -
Endurance (hrs)       36      36        36     10      5      4      3       3        1
Max airspeed (kts)    320     220       120    100     220    120    100     70       35

•  UAV height is very small; this tends to lead to smaller system designs than 6U,
   arrayed on the base of the fuselage/wings.
•  Payload weight is small, thus weight-constrained solutions are demanded.
•  UAVs tend to fly fairly high. A consequence is that without life support
   environments (no man) at this altitude, conduction cooling becomes mandatory.
•  All traditional HPEC applications are represented on all the platforms.


PowerPC Performance/Watt

•  Historically, have relied on Moore’s Law. Could wait and technology improvements
   would enable significant miniaturization. However, we observed that increases in
   absolute performance are accompanied by increases in power, and by consequence
   weight and volume.

•  Number of transistors available is increasing, but power consumption is increasing
   at almost the same rate. Increased infrastructure to handle power distribution and
   heat extraction incurs a penalty in size and weight. Alternative approaches are
   needed.

•  One approach: leverage field-programmable gate arrays (FPGAs) as programmable
   processors.

•  For some signal/image processing functions, FPGAs shown to provide a 10-20 fold
   performance boost over a PowerPC G4 processor. However, some tasks, e.g. filter
   weight computation and back-end processing, still perform better on a PowerPC.

•  In trying to maximize processing power in the smallest space, the trick is not only
   to find the optimum balance between FPGAs and PowerPCs, but also exactly which
   model of each chip to choose.
•  The popular comparison...
These are the resources most often receiving attention when people look at Xilinx parts

...But what really matters
•  For embedded signal/image processing applications, more critical elements tend to
   be number of multiplier blocks and block RAM size

•  Leads to different component selection favoring Pro range
500 MHz class PPC x 4 = 16 GFLOPS per slot =>
♦  6 slot = 96 GFLOPS
♦  12 slot = 192 GFLOPS
♦  20 slot = 320 GFLOPS

Assumptions
♦  FPGA = equivalent 40-100 GFLOPS
♦  500 MHz PPC = 4 GFLOPS

Small

2x 1 GHz class PPC per board or 2 FPGA per board =>
♦  2 slot = 96-216 GFLOPS
♦  4 slot = 112-616 GFLOPS
♦  8 slot = 224-1232 GFLOPS
=> Future FPGA + PPC exploitation on 3U better than existing 6U

2-4x processing - same system dimensions

Future PPC-only Solutions

4x 1.5 GHz class PPC = 48 GFLOPS per slot =>
♦  6 slot = 288 GFLOPS
♦  12 slot = 576 GFLOPS
♦  20 slot = 960 GFLOPS
=> PPC exploitation of VITA 46

4x 1 GHz class PPC per board or 2 FPGA per board =>
♦  6 slot = 192-1032 GFLOPS
♦  12 slot = 384-2232 GFLOPS
♦  20 slot = 640-3832 GFLOPS
=> FPGA + PPC exploitation on VME
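
The PPC-only and 6U FPGA + PPC figures on this slide can be reproduced with simple
arithmetic. The sketch below is a reading of the numbers, not a method stated in the
source: it assumes PPC throughput scales linearly with clock rate from the slide's
"500 MHz PPC = 4 GFLOPS" assumption, that the low end of each 6U range is an all-PPC
chassis, and that the high end is one PPC board plus FPGA boards at the upper FPGA
estimate. The function names are hypothetical, and the 3U ranges are not modeled.

```python
# Slide assumptions: 500 MHz PPC = 4 GFLOPS, FPGA = equivalent 40-100 GFLOPS.
PPC_GFLOPS_PER_GHZ = 8.0        # 4 GFLOPS / 0.5 GHz, assuming linear scaling
FPGA_GFLOPS_RANGE = (40, 100)


def ppc_board_gflops(clock_ghz, chips):
    """Throughput of one board carrying `chips` PowerPCs."""
    return PPC_GFLOPS_PER_GHZ * clock_ghz * chips


def ppc_only_system(slots, clock_ghz, chips_per_board=4):
    """Chassis filled entirely with PPC boards."""
    return slots * ppc_board_gflops(clock_ghz, chips_per_board)


def mixed_6u_range(slots, fpgas_per_board=2):
    """(low, high) GFLOPS for a 6U chassis of 4 x 1 GHz PPC or 2-FPGA boards.

    Inferred decomposition: low = all PPC boards; high = one PPC board plus
    FPGA boards rated at the top of the FPGA range.
    """
    ppc = ppc_board_gflops(1.0, 4)
    low = slots * ppc
    high = ppc + (slots - 1) * fpgas_per_board * FPGA_GFLOPS_RANGE[1]
    return low, high


for slots in (6, 12, 20):
    print(slots,
          ppc_only_system(slots, 0.5),   # existing 500 MHz class boards
          ppc_only_system(slots, 1.5),   # future 1.5 GHz class boards
          mixed_6u_range(slots))
```

With these assumptions the 6-slot case gives 96 and 288 GFLOPS for the two PPC-only
options and 192 to 1032 GFLOPS for the mixed chassis, matching the slide.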
Slot limitations on space-constrained systems also favor integrating the
analog-to-digital conversion and general I/O with the processing. This is
especially important for multi-channel systems.

Sensor I/O can be part of the base-board design, e.g. tuner/ADC, or be a
mezzanine card attached to the processors.

[Diagram label: User Display]
•  Example ARC-210 Form

Fitting 6 x 3U cPCI slots leaves total remaining space of
♦  Width 1” (20%)
♦  Height 1.7” (>30%)
♦  Length 6.3” (>35%)

[Drawing label: 0.8”]

•  MCP3 FCN + DRTi Analogue

[Drawing: dimensions to scale; ‘Warrior’ ARC-210, smallest ARC-210]

Space for power supplies and connectors in back (or front)

Space for cooling top and bottom (per conduction needs)

RF
♦  1 channel at 70 MSPS 14 bit input from 3 GHz operating band
♦  1 channel at 70 MSPS 14 bit output to 3 GHz operating band, +20 dBm

•  Digital = ~80-240 GFLOPs
♦  4 x 1 GHz PPCs
♦  =~ 40 GFLOPs
♦  4 x Virtex II P40 FPGAs
♦  =~ 40-200 GFLOP equivalent
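
As a rough check on this budget, the sketch below computes the raw data rate of one
70 MSPS, 14-bit channel and sums the PPC and FPGA estimates. It is a sketch under stated
assumptions: storing each 14-bit sample in a 16-bit word is a common convention but is
not stated in the source, and the function names are hypothetical.

```python
def raw_rate_mbit_s(sample_rate_msps, bits_per_sample):
    """Raw converter data rate in Mbit/s for one channel."""
    return sample_rate_msps * bits_per_sample


def stored_rate_mbyte_s(sample_rate_msps, container_bits=16):
    """Rate in MB/s if each sample is stored in a wider word.

    Assumption: 14-bit samples padded to 16 bits (not stated in the source).
    """
    return sample_rate_msps * container_bits / 8


def digital_budget_gflops(ppc_gflops, fpga_range_gflops):
    """(low, high) total when the PPC estimate is added to the FPGA range."""
    low, high = fpga_range_gflops
    return ppc_gflops + low, ppc_gflops + high


print(raw_rate_mbit_s(70, 14))                # 980
print(stored_rate_mbyte_s(70))                # 140.0
print(digital_budget_gflops(40, (40, 200)))   # (80, 240)
```

The last line reproduces the "~80-240 GFLOPs" total from the slide's own per-device
estimates.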


Mercury Computer Systems, Inc.

The Ultimate Performance Machine


[Block diagram: Image Formation; Radar Control; Guidance & Control; Power Supply;
Ethernet Hub; RS422; GPS Receiver]

©  2004  Mercury  Computer  Systems,  Inc. 


[Block diagram, continued: ADC PMC; DAC PMC; STALO; RF Up/Down-converter;
Digital Receiver; Quadrature Exciter; RF Front-end; RF Control/Status/Power]

Weight < 10 lb
Cost < $60k
Power Consumption < 150 W


Beamformer/DF
♦  COMINT
♦  ESM
♦  ELINT

•  If down-conversion added

[Block diagram: Digital Tuner (x4); System Host; Ethernet; NGA; User Display]

4x Direct high speed ‘digital
PMC site for digital receiver or modem etc.

FDK 2.0.x support
MCP3

•  Combined PowerPC & FPGA
♦  Flexibility of RISC processing code
♦  Density and bandwidth handling strengths of FPGAs

•  Deployable
♦  Ruggedized & conduction-cooled

•  Multiple I/Os direct to FPGA
♦  4x high-speed bus via J2
♦  Dual-channel analogue input digital receiver PMC option

Early Prototype
Analog I/O receiver
♦  2 x 80 MSPS 14-bit ADC
♦  Factory configurable
•  IF up to 100 MHz

[Block diagram: Frequency Ref. In; Synthesizer; Band-Pass Filter (x4); Gain;
Receiver Gating; Frequency Ref. Out; FPGA Bus (via PMC connector)]

PMC general features
♦  Direct interface to FPGA
♦  Stepped attenuators
♦  RF screening
♦  Clocks (int./ext.)
♦  Power managed
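
An IF of up to 100 MHz is above half the 80 MSPS sample rate, so the signal would land
in a higher Nyquist zone and appear at an aliased frequency after conversion. The source
does not say how the IF is digitized; the sketch below assumes direct IF (bandpass)
sampling of a real signal and shows where a few example IF values would appear. The
function names and the example frequencies other than 100 MHz are illustrative.

```python
def alias_frequency_mhz(signal_mhz, sample_rate_msps):
    """Apparent frequency (MHz) of a real tone after sampling."""
    remainder = signal_mhz % sample_rate_msps
    return min(remainder, sample_rate_msps - remainder)


def nyquist_zone(signal_mhz, sample_rate_msps):
    """1-based Nyquist zone; each zone is sample_rate / 2 wide."""
    return int(signal_mhz // (sample_rate_msps / 2)) + 1


SAMPLE_RATE_MSPS = 80  # per-channel ADC rate from the slide

for if_mhz in (20, 70, 100):
    print(if_mhz,
          alias_frequency_mhz(if_mhz, SAMPLE_RATE_MSPS),
          nyquist_zone(if_mhz, SAMPLE_RATE_MSPS))
```

Under this assumption a 100 MHz IF falls in the third Nyquist zone and appears at 20 MHz,
which is why band-pass filtering ahead of the converter matters in such a design.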


Processing Challenges in Shrinking HPEC Systems into Small Platforms

Stephen Pearce & Richard Jaenicke
Mercury Computer Systems, Inc.

High Performance Embedded Computing (HPEC) Conference

September 28, 2004